Evaluation of voice activity detection by combining multiple features with weight adaptation

Authors

  • Yusuke Kida
  • Tatsuya Kawahara
Abstract

For noise-robust automatic speech recognition (ASR), we propose a novel voice activity detection (VAD) method based on a combination of multiple features. The scheme uses a weighted combination of four conventional VAD features: amplitude level, zero crossing rate, spectral information, and Gaussian mixture model (GMM) likelihood. The combination weights are adaptively updated using minimum classification error (MCE) training. In this paper, we first investigate the effect of adapting the combination weights and the GMM parameters, and demonstrate that the weights can be effectively adapted with a single utterance. Then, we present an application of the method to ASR. The results confirm that the proposed method significantly outperforms conventional methods in various noise conditions.
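The weighted combination and its MCE-style adaptation can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the sigmoid loss, the learning rate, and the assumption that each of the four features has already been converted to a score that is positive for speech-like frames are all hypothetical choices made for illustration.

```python
import math


def combined_score(weights, features):
    """Weighted sum of per-feature scores; > 0 is read as 'speech' (assumed convention)."""
    return sum(w * f for w, f in zip(weights, features))


def mce_update(weights, features, is_speech, lr=0.1, gamma=1.0):
    """One gradient step on a sigmoid-smoothed classification error (illustrative only).

    features: four scores (amplitude, zero crossing rate, spectral, GMM likelihood),
              each assumed to be positive when that feature favours speech.
    """
    score = combined_score(weights, features)
    # Misclassification measure: positive when the frame is on the wrong side.
    sign = -1.0 if is_speech else 1.0
    d = sign * score
    loss = 1.0 / (1.0 + math.exp(-gamma * d))
    # Derivative of the sigmoid loss with respect to d.
    grad_scale = gamma * loss * (1.0 - loss)
    updated = [w - lr * grad_scale * sign * f for w, f in zip(weights, features)]
    # Assumption: keep weights positive and normalised so they sum to 1.
    updated = [max(w, 1e-6) for w in updated]
    total = sum(updated)
    return [w / total for w in updated]


# Usage: start from uniform weights and adapt on one labelled speech frame.
weights = [0.25, 0.25, 0.25, 0.25]
frame = [0.5, -0.2, 1.0, 0.1]
new_weights = mce_update(weights, frame, is_speech=True)
```

In this sketch, a feature that agrees with the label gains weight and one that disagrees loses weight, which is the general idea behind discriminative weight adaptation; the paper's actual discriminant function and update rule may differ.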

Similar resources

Voice activity detection based on conditional random fields using multiple features

This paper proposes a Voice Activity Detection (VAD) algorithm based on Conditional Random Fields (CRF) using multiple features. VAD is a technique used to distinguish between speech and non-speech in noisy environments and is an important component in many real-world speech applications. The posterior probability of output labels in the proposed method is directly modeled by the weighted sum o...

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

Evaluation the Mean Performance and Stability of Rice Genotypes by Combining Features of AMMI and BLUP Techniques and Selection Based on Multiple Traits

Additive main effect and multiplicative interaction (AMMI) and best linear unbiased prediction (BLUP) are two methods for analyzing multi-environment trials (MET). In this study, seven selected rice lines were evaluated along with two check varieties based on randomized complete block design in Tonekabon, Amol and Sari (Iran) in three growing seasons of 2011-14. To quantify the genotypic stabil...

Automated Detection of Multiple Sclerosis Lesions Using Texture-based Features and a Hybrid Classifier

Background: Multiple Sclerosis (MS) is the most frequent non-traumatic neurological disease capable of causing disability in young adults. Detection of MS lesions with magnetic resonance imaging (MRI) is the most common technique. However, manual interpretation of vast amounts of data is often tedious and error-prone. Furthermore, changes in lesions are often subtle and extremely unrepresentati...

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments

We propose a joint framework combining speech enhancement (SE) and voice activity detection (VAD) to increase the speech intelligibility in low signal-noise-ratio (SNR) environments. Deep Neural Networks (DNN) have recently been successfully adopted as a regression model in SE. Nonetheless, the performance in harsh environments is not always satisfactory because the noise energy is often domina...

Publication date: 2006